首页> 外文OA文献 >Compressed data structures: Dictionaries and data-aware measures
【2h】

Compressed data structures: Dictionaries and data-aware measures

机译:压缩的数据结构:词典和数据感知措施

代理获取
本网站仅为用户提供外文OA文献查询和代理获取服务,本网站没有原文。下单后我们将采用程序或人工为您竭诚获取高质量的原文,但由于OA文献来源多样且变更频繁,仍可能出现获取不到、文献不完整或与标题不符等情况,如果获取不到我们将提供退款服务。请知悉。

摘要

[[abstract]]In this paper, we propose measures for compressed data structures, in which space usage is measured in a data-aware manner. In particular, we consider the fundamental dictionary problem on set data, where the task is to construct a data structure for representing a set S of n items out of a universe U = {0,..., u - 1} and supporting various queries on S. We use a well-known data-aware measure for set data called gap to bound the space of our data structures. We describe a novel dictionary structure that requires gap + O(n log(u/n)/log n) + O(n log log(u/n)) bits. Under the RAM model, our dictionary supports membership, rank, and predecessor queries in nearly optimal time, matching the time bound of Andersson and Thorup's predecessor structure [A. Andersson, M. Thorup, Tight(er) worst-case bounds on dynamic searching and priority queues, in: ACM Symposium on Theory of Computing, STOC, 2000], while simultaneously improving upon their space usage. We support select queries even faster in O(log log n) time. Our dictionary structure uses exactly gap bits in the leading term (i.e., the constant factor is 1) and answers queries in near-optimal time. When seen from the worst-case perspective, we present the first O(n log(u/n))-bit dictionary structure that supports these queries in near-optimal time under the RAM model. We also build a dictionary which requires the same space and supports membership, select, and partial rank queries even more quickly in O(log log n) time. We go on to show that for many (real-world) datasets, data-aware methods lead to a worthwhile compression over combinatorial methods. To the best of our knowledge, these are the first results that achieve data-aware space usage and retain near-optimal time.
机译:[[摘要]]在本文中,我们提出了用于压缩数据结构的措施,其中以数据感知方式来测量空间使用情况。特别地,我们考虑关于集合数据的基本字典问题,其中的任务是构造一个数据结构,以表示宇宙U = {0,...,u-1}中的n个项的集合S并支持各种我们对S进行查询。我们对集合数据使用一种众所周知的数据感知测度,称为间断,以限制数据结构的空间。我们描述了一种新颖的字典结构,该结构要求间隔+ O(n log(u / n)/ log n)+ O(n log log(u / n))位。在RAM模型下,我们的词典在几乎最佳的时间内支持成员资格,等级和前任查询,与Andersson和Thorup的前任结构的时限相匹配。 Andersson,M. Thorup,动态搜索和优先级队列的最坏情况界限,见:ACM计算理论研讨会,STOC,2000年],同时改善了它们的空间使用率。我们支持以O(log log n)时间更快的选择查询。我们的字典结构在前导词中恰好使用了间隔位(即常数因子为1),并在接近最优的时间内回答了查询。从最坏的情况来看,我们介绍了第一个O(n log(u / n))位字典结构,该结构在RAM模型下以接近最佳的时间支持这些查询。我们还构建了一个字典,该字典需要相同的空间,并且支持在O(log log n)时间内更快地进行成员资格,选择和部分排名查询。我们继续表明,对于许多(真实世界)数据集,数据感知方法导致比组合方法有价值的压缩。据我们所知,这是实现数据感知空间使用并保留接近最佳时间的第一个结果。

著录项

  • 作者

    Gupta, Ankur;

  • 作者单位
  • 年度 2012
  • 总页数
  • 原文格式 PDF
  • 正文语种 [[iso]]en
  • 中图分类

相似文献

  • 外文文献
  • 中文文献
  • 专利
代理获取

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号